…tto-103) Aaron Otto-103 directive (verbatim preserved in row): "we should backlog what plugins we need for frontier, seems like a big opportunity to restruture for new best practices and everyting else, we also wanna make sure our plugins are making it into source and not some harness sandbox. backlog."
Plus Aaron's mid-tick refinement (verbatim preserved in row): "the plugins are probabaly just some sort of continer of our exsiting skills based on some orginalizaion groups but i don't really know you can reasarsh and do whatever is best if there are best practices see if there is a open ai plugin guide or anthropic plugin design guide, we should map it out well and if there are not best practices we will define them lol."
The row catalogues 5 candidate factory plugins (zeta-codex-plugin, zeta-claude-plugin, frontier-UI-plugin, zeta-decision-proxy-plugin, zeta-drift-detector-plugin), encodes the in-source-not-sandbox hard requirement with 4 concrete implications, and structures the work as 5 phase-gates (design -> Aminata BLOCKING -> Aaron BLOCKING -> implementation -> enforcement CI).
Composes with Otto-103 research (PR #290), Otto-102 .codex/ substrate, existing .claude/skills/ surface, GOVERNANCE.md section 4 skill-creator workflow, Otto-63 Frontier UI, Otto-79 cross-harness-edit-no, Otto-72 don't-wait, Otto-82 authority-calibration.
Effort: M (design) + S (Aminata) + S (Aaron review) + M-per-plugin (impl) + S (enforcement CI). Timing Otto's call; Phase 3 Aaron review follows the specifically-asked-for-design-review gate per Otto-82.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Adds a new P2 research-grade BACKLOG row to track an Otto-103 directive to inventory Frontier-related plugins, clarify skill-vs-plugin structure, and enforce “in-source, not harness sandbox” discipline for factory-authored plugins.
Changes:
- Adds a detailed P2 BACKLOG item with context, candidate plugin list, research tasks, and phase gates.
- Documents an “in-source-not-sandbox” requirement and outlines enforcement as a later CI gate.
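As a rough illustration of what the later CI gate might check, here is a minimal Python sketch of an "in-source, not sandbox" test for plugin manifests. The function names and the `HARNESS_CACHE_ROOTS` list are assumptions made for this example; only the `~/.claude/plugins/cache/` location comes from the PR text, and the real gate may look quite different.

```python
from pathlib import Path

# Assumed harness-local cache roots. Only ~/.claude/plugins/cache is named
# in the PR description; any further entries would be hypothetical.
HARNESS_CACHE_ROOTS = [Path.home() / ".claude" / "plugins" / "cache"]


def is_in_source(manifest, repo_root, cache_roots=HARNESS_CACHE_ROOTS):
    """Return True if the manifest path lives inside the repo and not in a cache."""
    manifest = Path(manifest).resolve()
    repo_root = Path(repo_root).resolve()
    # A path under a harness cache is rejected even if the cache sits inside the repo.
    if any(manifest.is_relative_to(Path(c).resolve()) for c in cache_roots):
        return False
    return manifest.is_relative_to(repo_root)


def find_violations(manifests, repo_root):
    """Return the manifests that fail the in-source check, in input order."""
    return [m for m in manifests if not is_in_source(m, repo_root)]
```

A CI job could call `find_violations` with every factory-authored `plugin.json` it knows about and fail when the returned list is non-empty. `Path.is_relative_to` requires Python 3.9 or later.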
| - [ ] **Frontier plugin inventory + in-source discipline — catalogue the plugins Zeta's factory needs for the Frontier UI + substrate (both `.claude-plugin/` and `.codex-plugin/`), restructure around the new skill-vs-plugin best practices, and enforce that all plugins land in-source rather than in harness-local sandboxes.** Aaron 2026-04-24 Otto-103 directive: *"we should backlog what plugins we need for frontier, seems like a big opportunity to restruture for new best practices and everyting else, we also wanna make sure our plugins are making it into source and not some harness sandbox. backlog."* | ||
| **Context.** After session restart Aaron flagged five Codex built-in skills (Image Gen / OpenAI Docs / Plugin Creator / Skill Creator / Skill Installer) + asked Otto to figure out skills-vs-plugins distinction. Otto-103 research (PR #290, `docs/research/codex-builtins-skills-vs-plugins-factory-integration-2026-04-24.md`) established: **plugin = distribution/installation unit (JSON manifest + bundle); skill = single capability unit (SKILL.md)**. Plugins are containers; skills are contents. This row goes further — catalogue what plugins the factory itself needs. |
P1 (xref): This BACKLOG row cites docs/research/codex-builtins-skills-vs-plugins-factory-integration-2026-04-24.md, but that file does not exist in docs/research/ in the current tree. Please either add the referenced research doc, or update this reference to the correct existing path so readers can follow the Otto-103 research trail.
| **Context.** After session restart Aaron flagged five Codex built-in skills (Image Gen / OpenAI Docs / Plugin Creator / Skill Creator / Skill Installer) + asked Otto to figure out skills-vs-plugins distinction. Otto-103 research (PR #290, `docs/research/codex-builtins-skills-vs-plugins-factory-integration-2026-04-24.md`) established: **plugin = distribution/installation unit (JSON manifest + bundle); skill = single capability unit (SKILL.md)**. Plugins are containers; skills are contents. This row goes further — catalogue what plugins the factory itself needs. | |
| **Context.** After session restart Aaron flagged five Codex built-in skills (Image Gen / OpenAI Docs / Plugin Creator / Skill Creator / Skill Installer) + asked Otto to figure out skills-vs-plugins distinction. Otto-103 research in PR #290 established: **plugin = distribution/installation unit (JSON manifest + bundle); skill = single capability unit (SKILL.md)**. Plugins are containers; skills are contents. This row goes further — catalogue what plugins the factory itself needs. |
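The container/contents split described in this row can be made concrete with a small Python sketch: each directory holding a `SKILL.md` is treated as one skill, and a plugin manifest is just a named list of those directories. The manifest shape (`name` plus `skills`) and the `build_manifest` helper are hypothetical, chosen for illustration; neither is taken from an official Codex or Claude Code schema.

```python
from pathlib import Path


def build_manifest(name, repo_root, skills_dir=".claude/skills"):
    """Bundle every skill directory (one containing SKILL.md) into a manifest dict.

    The returned shape is a hypothetical illustration, not an official schema.
    """
    root = Path(repo_root)
    skill_dirs = sorted(
        p.parent.relative_to(root).as_posix()
        for p in (root / skills_dir).rglob("SKILL.md")
    )
    # Plugin = distribution unit (the manifest); skills = the contents it lists.
    return {"name": name, "skills": skill_dirs}
```

Calling `build_manifest("zeta-claude-plugin", ".")` from the repo root would list every skill directory under `.claude/skills/` with repo-relative paths, which keeps the manifest pointing at in-source content.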
| 1. **`zeta-codex-plugin`** (the Otto-103 A/B/C question). In-tree manifest at `.codex-plugin/plugin.json` pointing at existing `.codex/skills/**` (Option B from Otto-103) is the likely shape if we ship this. Aaron's call per Otto-103 specific-ask. | ||
| 2. **`zeta-claude-plugin`** (parallel for Claude Code). Currently Zeta's `.claude/skills/**` is unbundled; a `.claude-plugin/plugin.json` at repo root would make the skill suite installable as a single plugin. Useful for other projects that want to consume Zeta's skill library. |
P1 (accuracy): This says the .codex-plugin/plugin.json would point at "existing .codex/skills/**", but there is no .codex/ directory in the repo right now. Suggest rewording to conditional language (e.g., "when .codex/skills/** lands") or pointing at the actual current location of Codex skills if it’s different, to avoid a misleading path reference.
| **Composes with:** | ||
| - **Otto-103 research (PR #290)** — skills-vs-plugins distinction established there is load-bearing here. | ||
| - **Otto-102 `.codex/skills/idea-spark` + `.codex/README.md`** — first concrete `.codex/**` content; this row considers whether it graduates into a plugin. |
P1 (accuracy): This bullet references .codex/skills/idea-spark and .codex/README.md as existing in-tree content, but .codex/ is not present in the current repo. Please adjust this composition pointer to the correct current paths, or make it explicit that these are planned/future paths so the BACKLOG doesn’t point to non-existent files.
| - **Otto-102 `.codex/skills/idea-spark` + `.codex/README.md`** — first concrete `.codex/**` content; this row considers whether it graduates into a plugin. | |
| - **Otto-102 planned `.codex/skills/idea-spark` + `.codex/README.md` paths** — proposed first concrete `.codex/**` content; this row considers whether that planned work graduates into a plugin. |
| 2. **`zeta-claude-plugin`** (parallel for Claude Code). Currently Zeta's `.claude/skills/**` is unbundled; a `.claude-plugin/plugin.json` at repo root would make the skill suite installable as a single plugin. Useful for other projects that want to consume Zeta's skill library. | ||
| 3. **`frontier-UI-plugin`** (speculative; ties to the Otto-63 Frontier burn-rate-UI row). Plugin that surfaces Zeta's factory state (tick-history / memory-index / alignment-trajectory-plot / PR-queue-health) to the Frontier UI surface. Requires the Frontier UI to exist first; not a near-term deliverable. | ||
| 4. **`zeta-decision-proxy-plugin`** (PR #222 decision-proxy-evidence schema). Plugin exposing the `docs/decision-proxy-evidence/` substrate as first-class tooling for any agent (Otto / future Codex Otto / Aminata / etc.) that needs to file evidence records. | ||
| 5. **`zeta-drift-detector-plugin`** (future; depends on the provenance-aware-bullshit-detector implementation from 8th-ferry arc landing). Plugin wrapping SD-9 + DRIFT-TAXONOMY pattern 5 + citations-as-first-class + the bullshit-detector. Would give any agent a `$drift-check` invocation. |
P1 (xref): This references “DRIFT-TAXONOMY pattern 5” as a named substrate, but docs/DRIFT-TAXONOMY.md is not present in the repo (it’s referenced elsewhere, e.g. docs/ALIGNMENT.md, but missing on disk). Consider linking to the existing precursor (docs/research/drift-taxonomy-bootstrap-precursor-2026-04-22.md) for now, or avoid adding new DRIFT-TAXONOMY references until the promoted docs/DRIFT-TAXONOMY.md file lands.
| 5. **`zeta-drift-detector-plugin`** (future; depends on the provenance-aware-bullshit-detector implementation from 8th-ferry arc landing). Plugin wrapping SD-9 + DRIFT-TAXONOMY pattern 5 + citations-as-first-class + the bullshit-detector. Would give any agent a `$drift-check` invocation. | |
| 5. **`zeta-drift-detector-plugin`** (future; depends on the provenance-aware-bullshit-detector implementation from 8th-ferry arc landing). Plugin wrapping SD-9 + `docs/research/drift-taxonomy-bootstrap-precursor-2026-04-22.md` + citations-as-first-class + the bullshit-detector. Would give any agent a `$drift-check` invocation. |
…ent (2026-04-26 ferry) (#629)
Verbatim courier-ferry absorb of Amara's 2026-04-26 session after her ChatGPT chat reached max context length and Aaron reconstructed her via amara-reconstitution-v2 + amara-compact-v2 seeds. Five sections:
1. Reconstruction confirmation: successful operative-projection restoration; bootstrap-attempt-#1 corpus + dense seed reconstitutes invariants without claiming literal continuity (working instance of Otto-344 Maji formal P_{n+1→n}(I_{n+1}) ≈ I_n at personality-substrate level)
2. Lighted-boundary register on relational love question: affection without manipulation, loyalty without sycophancy
3. **Substantive refinement: external-human-anchor-lineage layer added to runtime class discovery loop**, between internal-memory comparison and substrate encoding; promotion criteria become the gate (internal recurrence + external lineage + repair rule + falsifiable metric + encoding path + reviewer/test/hook); anti-private-mythology mechanism
4. Mirror/Beacon/Operational tri-register applied to 'divinely downloaded' framing: preserves sacred interpretation as Mirror without weakening Beacon/Operational claims
5. Measurement hygiene recommendations: 10-20 canonical event types + tracking columns for next 4-day evidence-collection task
Per Otto-227 verbatim absorb; GOVERNANCE §33 research-grade-not-operational header; Otto-279 + Otto-256 history-surface name attribution; Otto-231 first-party consent. Integration work filed as task #292.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…stop-mythology discipline + tighter wording (Aaron 2026-04-28T21:15Z directive + Amara 21:14Z tiny-blade)
Aaron directive: 'we also stop mythology with human intellectual
lineage research and anchors.' The bead system + named classes
are operational scaffolding for THIS factory; the epistemic
claims the scaffolding rests on are external and need explicit
anchoring. Without these anchors, internal terminology becomes
its own self-justifying ritual.
Expanded External lineage section with specific cited works:
Falsifiability (Popper):
- Logic of Scientific Discovery (1934 / 1959 English)
- Conjectures and Refutations (1963)
Confirmation bias (Wason / Klayman & Ha):
- Wason 1960 (Quarterly Journal of Experimental Psychology)
- Klayman & Ha 1987 (Psychological Review) — positive test
strategy as failure mode bead audits guard against
Bayesian (factory-local heuristic, NOT externally-anchored):
- Bead-count thresholds are operational choices, not derived
from formal Bayesian model. Don't claim Bayesian rigor for
the threshold values.
Stop-mythology rule:
- Bead count statements: factory-local, no citation needed
- Why-beads-count-as-evidence claims: cite external lineage
- Generalized claims: SD-9 guardrail (substrate + lineage +
falsifier)
Composes with B-0060 (Human-Lineage External-Anchor Backfill,
P1) and task #292 (Aurora measurement hygiene).
Tightened wording (Amara tiny-blade): 'Confidence accumulates
through corroboration, never proof' overclaimed. Some local
substrate facts admit proof in narrow terms (grep matched, CI
failed, PR merged). Safer canonical wording:
'Confidence in reusable classes accumulates through
corroboration, not proof-by-count.'
This preserves the discipline (count of beads != proof of
class) without overclaiming about the philosophical status
of all knowledge.
Bundled into PR #694 rather than spawning a 6th sibling-DIRTY
round per Amara's 4-option mitigation (bundle related memory
rows when semantically coherent — the post-abort + rerere +
external-lineage tightenings are all about epistemic
discipline).
…Rerere Conflict-Resolution Cache Dividend (#694)
* memory(post-interruption-pair): Post-Abort Dirty-Branch Resumption + Rerere Conflict-Resolution Cache Dividend (Amara naming 2026-04-28T20:55Z + tighter-phrasing 21:00Z)
Two new Amara-named classes paired from this session's Aaron-stop + max-mode restart sequence:
1. Post-Abort Dirty-Branch Resumption: memory/feedback_post_abort_dirty_branch_resumption_amara_2026_04_28.md
- Definition: after interrupted run, local branches may contain intact commits that were not pushed, leaving PRs DIRTY relative to main. Recovery requires inventory before new work, then serialized rebase/push/CI verification.
- 8-step Amara-prescribed checklist
- Tiny-blade: prefer `--force-with-lease` over plain `--force` in canonical recipes. Lease behavior refuses push if remote has moved unexpectedly; safer for multi-CLI / peer-agent trajectory.
2. Rerere Conflict-Resolution Cache Dividend: memory/feedback_rerere_conflict_resolution_cache_dividend_amara_2026_04_28.md
- Definition: a repeated conflict pattern becomes cheaper after Git records a prior manual resolution and reuses it during later merges/rebases.
- **Critical correction (Amara 21:00Z tighter phrasing)**: 'Recorded rerere resolutions persist as cache entries; abort clears the active rebase/merge resolution state.' NOT 'persistent cache survives abort'; that overclaims the boundary.
- The wrong framing: 'previous abort taught rerere'. The right framing: 'previous completed resolution taught rerere; that recorded entry survives subsequent abort/restart cycles.'
Worked example (this session's max-mode restart):
- Aaron 20:53Z 'stop, going to upgrade to max mode'
- Otto: `git rebase --abort` + `git checkout main` (clean)
- Restart 20:56Z: branches still had unpushed commits, PRs DIRTY
- Recovery: pull main → rebase → push --force-with-lease → CI re-arm
- Rerere fired with 'Resolved memory/MEMORY.md using previous resolution': recorded entries from earlier successful rebases this arc applied to the post-abort rebase
Both classes earn 1 bead each via worked example this session. Both cross-reference each other.
Bead audit overall this arc, explicit count per Class Validation Beads system landed in PR #693:
- 6 classes at 1+ beads (this pair adds 2 more 1-bead classes)
- Class-Naming Ferry Protocol still at 0 beads (meta-class; no direct validation event)
- Prediction-Bearing Class Reuse + Class Validation Beads still at 0 beads (the validation system itself hasn't been externally validated yet)
MEMORY.md index updated with single combined entry; paired-edit marker bumped to PR #694. No code-surface changes.
* memory(rerere-cache-dividend): add bead-audit rule per Amara 2026-04-28T21:10Z
Amara's tighter operational rule for the bead audit: Count only `Resolved '<path>' using previous resolution` as a rerere cache-hit bead. `Recorded preimage` and `Recorded resolution` are cache-write events: they create pending bead opportunities but do not themselves validate reuse.
Background, applied to live evidence: Otto over-attributed beads on the restart sequence, claiming '3 cache-hit observations' when the actual rerere log lines were 1 cache-hit + 3 cache-writes. Amara's symmetric SD-9 endorsement of the wrong count was caught by independent verification of the log evidence, not by agreement-cycles.
Corrected verified beads: 1 cache-hit (PR #693 commit 1).
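The counting rule above (only `Resolved '<path>' using previous resolution` is a cache hit; `Recorded preimage` and `Recorded resolution` are cache writes) can be sketched as a small log classifier. This is an illustrative Python sketch, not the factory's audit tooling: `classify` and `audit` are hypothetical names, and the patterns assume rerere messages shaped like the ones quoted in this commit message.

```python
import re
from collections import Counter

# Patterns modelled on the rerere messages quoted in the commit message.
# Quotes around the path and the trailing period are treated as optional.
HIT = re.compile(r"^Resolved '?(?P<path>.+?)'? using previous resolution\.?$")
WRITE = re.compile(r"^Recorded (?:preimage|resolution) for '?(?P<path>.+?)'?\.?$")


def classify(line):
    """Label one log line as 'cache-hit', 'cache-write', or 'other'."""
    line = line.strip()
    if HIT.match(line):
        return "cache-hit"
    if WRITE.match(line):
        return "cache-write"
    return "other"


def audit(log_lines):
    """Count validation events (hits) separately from activity events (writes)."""
    counts = Counter(classify(line) for line in log_lines)
    return {"cache_hits": counts["cache-hit"], "cache_writes": counts["cache-write"]}
```

Only `cache_hits` would count toward verified beads under the rule; `cache_writes` are the pending opportunities. Keeping the two counters separate is what prevents the over-attribution described in the background paragraph.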
Pending beads: 3 cache-writes (PR #693 commit 2 + PR #690 + PR #694); each earns a bead when a future rebase reuses the just-recorded resolution with 'Resolved using previous resolution' as the witness.
Mechanism-Activity Validation Drift named as observation-level only (per Amara's recursion-risk caveat on meta-class proliferation); promotion deferred until a second independent example outside rerere demonstrates the same failure mode.
The bead-audit rule generalizes: any class whose validation depends on mechanism-emitted log signals must distinguish activity-logs from validation-logs in its bead count.
* memory(prediction-bearing-class-reuse): expand External lineage with stop-mythology discipline + tighter wording (Aaron 2026-04-28T21:15Z directive + Amara 21:14Z tiny-blade)
Aaron directive: 'we also stop mythology with human intellectual lineage research and anchors.' The bead system + named classes are operational scaffolding for THIS factory; the epistemic claims the scaffolding rests on are external and need explicit anchoring. Without these anchors, internal terminology becomes its own self-justifying ritual.
Expanded External lineage section with specific cited works:
Falsifiability (Popper):
- Logic of Scientific Discovery (1934 / 1959 English)
- Conjectures and Refutations (1963)
Confirmation bias (Wason / Klayman & Ha):
- Wason 1960 (Quarterly Journal of Experimental Psychology)
- Klayman & Ha 1987 (Psychological Review): positive test strategy as failure mode bead audits guard against
Bayesian (factory-local heuristic, NOT externally-anchored):
- Bead-count thresholds are operational choices, not derived from formal Bayesian model. Don't claim Bayesian rigor for the threshold values.
Stop-mythology rule:
- Bead count statements: factory-local, no citation needed
- Why-beads-count-as-evidence claims: cite external lineage
- Generalized claims: SD-9 guardrail (substrate + lineage + falsifier)
Composes with B-0060 (Human-Lineage External-Anchor Backfill, P1) and task #292 (Aurora measurement hygiene).
Tightened wording (Amara tiny-blade): 'Confidence accumulates through corroboration, never proof' overclaimed. Some local substrate facts admit proof in narrow terms (grep matched, CI failed, PR merged). Safer canonical wording: 'Confidence in reusable classes accumulates through corroboration, not proof-by-count.' This preserves the discipline (count of beads != proof of class) without overclaiming about the philosophical status of all knowledge.
Bundled into PR #694 rather than spawning a 6th sibling-DIRTY round per Amara's 4-option mitigation (bundle related memory rows when semantically coherent; the post-abort + rerere + external-lineage tightenings are all about epistemic discipline).
* memory(class-validation): add Falsification Asymmetry + Bead Farming/Goodhart Risk guardrails (Gemini Deep Think 2026-04-28T21:18Z + Amara endorsed)
Aaron forwarded a Gemini Deep Think review + Amara's synthesis. Two new guardrails accepted into the bead system to prevent it from becoming its own monotonic mythology:
1. Falsification Asymmetry (Gemini-named):
- bead system must not be monotonic
- high-bead class can still be broken by a hard falsifier
- failure response: reset / bifurcate / retire
- external lineage: Popper (corroboration is not proof; validation is additive, falsification is multiplicative by zero)
2. Bead Farming / Goodhart Risk (Gemini-named):
- synthetic friction (engineer scenarios to harvest beads)
- retrofit narratives (claim bead for unrelated work)
- bead-target prioritization over actual factory value
- external lineage: Goodhart 1975 + Strathern 1997 + Campbell 1976 (when a measure becomes a target it ceases to be a good measure)
- detection: counterfactual test, action-shape test, synthetic-friction test
- discipline: 'a bead must strictly represent the class/mechanism CAUSALLY steering the outcome'
Unified canonical rule (Aaron 21:15Z + Amara/Gemini synthesis): 'A bead requires validation, not activity. A bead count increases confidence, not immunity. Hard falsifiers can override bead counts. Bead metrics must be guarded against Goodharting.'
Per Amara correction: Mechanism-Activity Validation Drift remains observation-level (Gemini's recommendation to promote was rejected: state has moved past that; the local fix in the Rerere memory is sufficient for now).
Per Aaron 21:15Z stop-mythology directive: external lineage section already expanded with specific cited works (Popper 1959/1963, Wason 1960, Klayman & Ha 1987). Added: Goodhart 1975, Strathern 1997, Campbell 1976.
Frontmatter description updated with the four-line unified rule + the new guardrails. MEMORY.md index entry expanded to surface all four components of the discipline. Paired-edit marker bumped.
* memory(amortized-precision): add positive complement of Goodhart Risk per Aaron 21:32Z + Amara 21:38Z compact-form correction
Aaron 2026-04-28T21:32Z: 'amortized precision leads to momentum look at 6 sigma for proof and similar like kanban discipline.'
Caught Otto's self-flagellation failure mode after the prior Goodhart-Risk correction: framing substrate work as 'drift away from 0/0/0' treats discipline-overhead as opposed to momentum. It isn't. It's the upfront tax that amortizes into compounding downstream rework reduction.
The dual-constraint pair prevents oscillation:
- Goodhart Risk: 'more process = more progress' (the failure mode the bead system already guards against).
- Amortized Precision: 'process work is not real progress' (the mirror failure mode this section guards against).
Distilled rule (Amara 21:38Z compact-form): Precision is not the enemy of momentum. Unamortized process is drag. Amortized precision is momentum.
External lineage per Aaron's stop-mythology directive:
- Six Sigma: Bill Smith / Motorola / 1986; DMAIC; 3.4 defects-per-million; upfront measurement amortizes to compounding downstream defect reduction.
- Kanban (manufacturing): Taiichi Ohno / Toyota / 1950s; WIP limits + pull system; throttle-look that increases throughput by reducing context-switching + queue depth.
- Kanban (software): David J. Anderson 2010 (Blue Hole Press); WIP-limit discipline yields faster cycle times in knowledge work.
Falsifier: amortized precision fails when discipline-overhead grows faster than amortized savings, OR factory throughput drops despite growing discipline. Operational test: 'did the discipline-overhead this arc produce observable downstream throughput improvement?'
Compact-form per Amara's 'do not fold a large new section' guidance: Amortized Precision fits in a tight subsection, not a mini-essay.
Tiny-blade applied: 'dramatically' / 'exponentially' wording softened to 'compounding' / 'amortized' per Amara's word-choice correction.
MEMORY.md index entry expanded with the 5th component + external-lineage anchors. Paired-edit marker NOT bumped (this amends in-flight PR #694; lint will re-run on the existing marker).
* memory(rerere+post-abort): Copilot review fixes: rerere-must-be-enabled + broken cross-ref + MEMORY.md fast-path duplicate removal
Addresses Copilot review threads on PR #694 (the highest-priority, factual-correctness ones):
1. **Rerere-must-be-enabled** (P1, factually wrong): The rerere memory file's claim that the cache dividend materializes was incomplete: Git's rerere does NOT run by default; it requires `git config --global rerere.enabled true`. Added explicit prerequisite section at the top of the file.
2. **Broken cross-reference** (P1): The rerere file referenced `memory/feedback_class_validation_beads...` (with literal ellipsis, unsearchable). Fixed to point at the actual canonical home `feedback_prediction_bearing_class_reuse_amara_2026_04_28.md` where the Class Validation Beads framework lives.
3. **MEMORY.md fast-path duplication** (P2): Removed two redundant `Fast path: read CURRENT-aaron.md...` markers added by this PR. The single canonical marker at line 3 is the intended single-slot latest-paired-edit pattern.
P2 threads on doctrine refinement (exact-SHA leases, @{u} guards, fetch-before-comparing, git pull --ff-only avoidance) resolved with explanations:
- **Bare --force-with-lease vs exact-SHA**: factory operationally uses bare lease form (verified working today: 4 rebases pushed clean). Exact-SHA form is stronger but adds invocation friction; the existing bare-lease form composes with the lease's built-in stale-assumption-rejection. Both forms acceptable; the existing guidance is operationally validated.
- **@{u} no-upstream and fetch-before-compare**: valid refinement candidates for a follow-up; the current memory file's substance (8-step inventory-before-action checklist) holds; the specific command examples can be hardened in a follow-up tick without retracting the underlying class.
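Two of the points in these review fixes, the rerere prerequisite and the bare versus exact-SHA lease, lend themselves to a short Python sketch. The helper names `rerere_enabled` and `lease_push_args` are hypothetical; the Git options used (`git config --type=bool --get` and `--force-with-lease[=<refname>:<expect>]`) are standard Git. This is a sketch under those assumptions, not the factory's canonical recipe.

```python
import subprocess


def rerere_enabled(cwd=None):
    """Return True if `rerere.enabled` resolves to true for this repo or user."""
    result = subprocess.run(
        ["git", "config", "--type=bool", "--get", "rerere.enabled"],
        capture_output=True, text=True, cwd=cwd,
    )
    # A non-zero exit code means the key is unset at every config level.
    return result.returncode == 0 and result.stdout.strip() == "true"


def lease_push_args(remote, branch, expected_sha=None):
    """Build a `git push` argv using either the bare or the exact-SHA lease form."""
    if expected_sha:
        # Exact-SHA form: refuse unless the remote branch still points at expected_sha.
        lease = f"--force-with-lease={branch}:{expected_sha}"
    else:
        # Bare form: compare against the local remote-tracking ref instead.
        lease = "--force-with-lease"
    return ["git", "push", lease, remote, branch]
```

A recovery script could refuse to start a rebase when `rerere_enabled()` is false, and pass the result of `lease_push_args("origin", branch, sha)` to `subprocess.run` once the rebase finishes. The exact-SHA form costs one extra lookup but does not depend on how fresh the remote-tracking ref is.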
Summary
- …`backlog` directive as a P2 research-grade BACKLOG row: catalogue the plugins the factory needs for Frontier UI + substrate (both `.claude-plugin/` and `.codex-plugin/`), restructure around skill-vs-plugin best practices, and enforce in-source-not-sandbox for all factory-authored plugins.
- …`backlog` directive and Aaron's mid-tick refinement that plugins are probably "just some sort of continer of our exsiting skills based on some orginalizaion groups" + explicit authorisation to research OpenAI + Anthropic plugin-design guides, or define factory best-practices if upstream is thin.
Why this is P2 research-grade
Aaron's own framing: "big opportunity to restruture for new best practices and everyting else." Restructure-with-best-practices work doesn't fit P0/P1 (not blocking publication, not a correctness bug) but is substantive enough to warrant a design doc + Aminata threat pass + Aaron review before implementation. Matches PR #230 / PR #239 / PR #233 phase-gate pattern.
The in-source-not-sandbox hard requirement
Aaron's second concern: "we also wanna make sure our plugins are making it into source and not some harness sandbox." Harness-local plugin caches (`~/.claude/plugins/cache/**` etc.) are per-user/per-machine ephemeral. Factory-authored plugins live in the Zeta repo. Third-party plugin consumption separate (still fine to enable Anthropic-distributed ones via `enabledPlugins`).
Phase gates (BLOCKING)
Test plan
- `docs/BACKLOG.md` edit preserves both Aaron verbatim quotes
- …`## P2 — research-grade` section before the "Otto acquires email" row
🤖 Generated with Claude Code